
Self-modifying code
The Forbidden Code: Self-Modifying Code - Altering Reality While Running
Welcome, fellow explorer of the digital underworld. In the realm of programming, much of what you learn in standard courses focuses on clean, predictable structures: functions call other functions, data is processed, and control flows through well-defined paths. But beneath this surface lies a world of techniques that break the rules, offering immense power but also significant danger and complexity. One such technique, a true relic of necessity turned forbidden art, is Self-Modifying Code (SMC).
This resource will pull back the curtain on SMC, exploring what it is, how it works, why you might (or might not) use it, and the profound implications it has for system architecture, security, and the very nature of executable instructions.
What is Self-Modifying Code?
Self-Modifying Code (SMC or SMoC): Executable code that intentionally alters its own instructions in memory while it is running.
At its core, SMC is exactly what it sounds like: a program that changes its own machine code or source code during its execution. Imagine a craftsman who can reshape his tools – or even the blueprint he's following – while in the middle of building something. That's SMC.
It's crucial to distinguish intentional self-modification from accidental memory corruption, such as a buffer overflow overwriting nearby code. SMC is a deliberate programming technique.
The primary motivations for using SMC have historically included:
- Performance Optimization: Reducing the number of instructions executed in a critical path.
- Code Reduction: Eliminating repetitive code sequences by dynamically adjusting a single, more general sequence.
- Flexibility: Adapting program behavior based on runtime conditions in ways difficult or inefficient with static code.
While less common in everyday application programming today due to its complexities and conflicts with modern system designs, understanding SMC is vital for comprehending low-level system behavior, historical computing challenges, and certain advanced or specialized domains.
How Does Self-Modifying Code Work?
SMC isn't a single trick; it encompasses various methods to achieve the goal of changing the execution path or logic based on runtime state. These methods can be broadly categorized:
- Overwriting Existing Instructions: The program identifies a specific instruction (or part of an instruction, like an opcode, register, or address) in its own code segment in memory and replaces it with different bytes. This is the most direct form.
- Generating New Instructions: The program allocates a buffer in memory, constructs a sequence of machine code instructions byte by byte within that buffer, and then transfers execution control to this newly generated code.
- Indirect Modification: The code itself isn't directly overwritten, but the program changes pointers or addresses that determine which code is executed next. This often involves modifying function pointers or jump targets stored in data areas, effectively rerouting execution without altering the target code itself. While technically not modifying the code, it achieves a similar effect of dynamic control flow alteration and is often discussed alongside SMC.
These modifications can occur at different times:
- During Initialization: Based on input parameters or detected system configurations when the program starts. This is sometimes seen as a form of advanced 'software configuration' rather than true dynamic SMC, analogous to setting jumpers on a hardware board.
- Throughout Execution ("On the Fly"): Modifications happen dynamically based on reaching specific program states or encountering certain conditions during runtime. This is the more complex and often performance-driven form of SMC.
Regardless of the timing or method, the core idea is that the instructions being fetched and executed by the CPU are not fixed from the moment the program loads; they can change based on the program's own actions.
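To make the "overwriting existing instructions" category concrete, here is a minimal Python sketch of a toy machine whose program rewrites the operand of one of its own instructions before reaching it. The opcode names (LOADI, ADDI, STORE, PRINT, HALT), the two-cell instruction layout, and the run function are all invented for this illustration; it models the idea only and does not correspond to any real instruction set.

```python
# Toy machine: code and data share one flat memory list.
# The opcodes below are hypothetical and exist only for this sketch.
LOADI, ADDI, STORE, PRINT, HALT = range(5)

def run(memory):
    """Execute two-cell instructions (opcode, operand) until HALT; return printed values."""
    acc, pc, out = 0, 0, []
    while True:
        op, arg = memory[pc], memory[pc + 1]   # fetch from memory on every step
        pc += 2
        if op == LOADI:
            acc = arg
        elif op == ADDI:
            acc += arg
        elif op == STORE:
            memory[arg] = acc                  # the target may be a code cell
        elif op == PRINT:
            out.append(acc)
        elif op == HALT:
            return out

program = [
    LOADI, 42,    # cells 0-1: acc = 42
    STORE, 7,     # cells 2-3: overwrite cell 7, the operand of the ADDI below
    LOADI, 1,     # cells 4-5: acc = 1
    ADDI, 0,      # cells 6-7: written as "add 0", executed as "add 42"
    PRINT, 0,     # cells 8-9: record acc
    HALT, 0,      # cells 10-11
]

result = run(program)
print(result)        # [43]
print(program[7])    # 42: the operand was rewritten in place
```

Reading the listing alone suggests the program prints 1; it prints 43 because the instruction that executes is not the instruction that was written. That gap between listing and behavior is the defining property of SMC.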
Implementing Self-Modifying Code: Low-Level vs. High-Level
The ease and nature of implementing SMC depend heavily on the programming language and the system's architecture.
Low-Level Languages: The Assembly Arena
Assembly language is the most natural environment for SMC because you are directly manipulating machine instructions. You have fine-grained control over memory addresses and the bytes that represent opcodes, registers, and operands.
Direct Overwriting: You can calculate the memory address of an instruction within your own code segment and use standard move or store instructions to change its bytes.
Example: IBM System/360 Initialization Optimization
Consider a subroutine that performs some setup (like opening a file) only on the first call, and then skips that setup on subsequent calls. A common pattern without SMC is to use a flag:
    SUBRTN:      TEST_FLAG
                 Branch_if_flag_set, SKIP_SETUP
    SETUP_CODE:  OPEN_FILE           ; Open the file (only needed once)
                 SET_FLAG            ; Set the flag so we skip next time
    SKIP_SETUP:  ...                 ; Normal processing
                 ...
                 BR Return_address
Every time SUBRTN is called, the TEST_FLAG instruction and the conditional branch are executed. With SMC, this overhead can be eliminated after the first call. The initial code would look like this:

    SUBRTN:  NOP  OPENED          ; Initially a no-operation: BC 0,OPENED (X'4700' + address of OPENED)
             OI   SUBRTN+1,X'F0'  ; First call only: OR the second byte of the NOP with X'F0',
                                  ; changing X'4700....' into X'47F0....' (BC 15,OPENED)
             OPEN ...             ; Open the input file (only needed once)
    OPENED:  ...                  ; Normal processing resumes here
             ...
             BR   R14             ; Return to the caller

    CALLER:  BAL  R14,SUBRTN      ; Call the subroutine

On the S/360, NOP and the unconditional branch are both forms of the Branch on Condition instruction (BC, opcode X'47'). The four bits that follow the opcode are a condition mask, not a register number: a mask of 0 never branches, and a mask of 15 (X'F') always branches. The OI (OR Immediate) instruction therefore only has to set the high four bits of the second byte to turn the NOP into an unconditional branch.

On the first call, the NOP does nothing, the OI patches it, and the file is opened. On subsequent calls to SUBRTN, the instruction at SUBRTN is no longer a NOP but an unconditional branch to OPENED, which skips both the OI and the OPEN and goes straight into the normal processing. The OI instruction is executed only once. This shortens the instruction path by removing the TEST_FLAG and the conditional branch from every subsequent call, leaving a single unconditional branch.

Creating New Instructions: You can write assembly code that builds machine code sequences in a data buffer, for instance by calculating addresses or values at runtime and embedding them directly into generated instructions such as MOVE, LOAD, or STORE.

Example: Overcoming Instruction Set Limitations (Intel 8080)
The Intel 8080 processor had an instruction, IN port, which reads a byte from an I/O port whose number is fixed in the instruction itself. You couldn't use a value in a register to specify the port number. If you needed to read from a port whose number was determined at runtime (say, stored in register B), you couldn't write IN B.

The IN port instruction is two bytes long: the opcode byte (DBh) followed by the port number. To read from a dynamic port, you could patch the second byte before executing the instruction:

    ; To read from the port whose number is stored in register B:
            MOV  A, B            ; Get the port number from B into A
            STA  DYNAMIC_IN+1    ; Store A into the second byte of the IN instruction
            CALL DYNAMIC_IN      ; Execute the modified instruction; the input byte arrives in A
            ...                  ; Code execution continues here

    DYNAMIC_IN:
            IN   00h             ; This is the instruction we modify. The '00h' byte is replaced.
            RET                  ; Return to the caller
The IN 00h instruction is a placeholder. Before executing it, the program writes the desired port number from register B into the second byte of the instruction (DYNAMIC_IN+1). Then, when CALL DYNAMIC_IN is executed, the processor fetches the instruction IN [value_of_B], achieving the dynamic port input.

Cache Considerations: On modern processors, SMC can be tricky because of CPU caches, specifically separate instruction and data caches. If you modify code that is already in the instruction cache, the processor might execute the old version from the cache instead of the new one from memory. On architectures that do not keep the two caches coherent in hardware (many ARM and other RISC designs, for example), programmers must explicitly write back the data cache and invalidate the instruction cache for the modified memory region to ensure the processor fetches the updated code. Failure to do this leads to unpredictable and often incorrect behavior. x86 processors generally detect writes to cached code on their own, but they pay for it with a pipeline flush. Either way, the overhead can negate the performance benefits of SMC, especially if modification happens frequently.
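The stale-instruction problem described under Cache Considerations can be sketched with a toy model in Python. ToyCPU and its fetch, store, and invalidate_icache methods are hypothetical names made up for this illustration; real caches work on whole lines rather than single addresses, and on some architectures the hardware performs the invalidation itself.

```python
class ToyCPU:
    """Memory plus a naive instruction cache that stores never update."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.icache = {}                 # address -> cached instruction

    def fetch(self, addr):
        # Serve from the instruction cache when possible, else fill it from memory.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        # Data-side write: updates memory but does not touch the instruction cache.
        self.memory[addr] = value

    def invalidate_icache(self, addr):
        # The explicit maintenance step the programmer must remember to issue.
        self.icache.pop(addr, None)

cpu = ToyCPU(["NOP", "HALT"])
first = cpu.fetch(0)            # "NOP" is now cached
cpu.store(0, "JUMP 1")          # self-modification: memory changes
stale = cpu.fetch(0)            # still "NOP": the cache hides the new instruction
cpu.invalidate_icache(0)
fresh = cpu.fetch(0)            # "JUMP 1": refetched from memory
print(first, stale, fresh)
```

In this model the second fetch returns the old instruction even though memory already holds the new one. Only the explicit invalidation makes the change visible, which is the step that is easy to forget on hardware that requires it.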
High-Level Languages: The Illusion of Change
Most modern compiled high-level languages (like C, C++, Java, Rust) do not directly support modifying their own machine code after compilation. This is by design, promoting safety, portability, and static analysis. However, some languages and environments offer features that provide an illusion of self-modification or achieve similar dynamic behavior:
Dynamic Interpretation (eval): Languages with eval functions (like Python, Perl, JavaScript, PHP) allow programs to take a string containing source code and then parse and execute it at runtime. While the interpreter is running new code, it's not typically mutating previously loaded code in place. It's more like generating and running a new mini-program or function.

Example: JavaScript Function Pointer Modification
This example shows changing behavior by swapping function pointers, not modifying machine code directly, but achieving a similar dynamic effect.
    let behavior = function() {
      console.log("First behavior");
    };

    function performAction() {
      behavior(); // Call through the pointer
    }

    performAction(); // Output: First behavior

    // Now, "modify" the behavior by changing the pointer
    behavior = function() {
      console.log("Second behavior");
    };

    performAction(); // Output: Second behavior
This isn't true SMC at the machine code level, but it demonstrates dynamic behavior change based on program state, which is one goal sometimes addressed by SMC.
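The eval-style dynamic interpretation described before the JavaScript example can be sketched in Python, which provides the built-ins eval (for expressions) and exec (for statements). The function names greet and shout and the source strings are placeholders for this sketch; in a real program the strings would arrive at runtime, and executing untrusted text this way is unsafe.

```python
# A previously defined function: eval/exec below will not alter its code object.
def greet(name):
    return "Hello, " + name

original_code = greet.__code__

# Source text that might arrive at runtime (hard-coded here for the sketch).
expression_source = "greet('world').upper()"
definition_source = "def shout(name):\n    return greet(name) + '!'\n"

# eval parses and evaluates an expression string.
evaluated = eval(expression_source)

# exec runs statements; here it defines a brand-new function in a namespace.
namespace = {"greet": greet}
exec(definition_source, namespace)
shout = namespace["shout"]

print(evaluated)                          # HELLO, WORLD
print(shout("world"))                     # Hello, world!
print(greet.__code__ is original_code)    # True: existing code was left untouched
```

The last line is the point of the sketch: new code was created and run, but the code already loaded for greet is the same object as before.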
Source Code Modification & Re-Interpretation: Some older or niche interpreted languages (like SNOBOL, some batch scripting) might execute directly from a text representation in memory that the program can edit.
Example: DOS Batch File "Menu System"
Consider a simple command-line menu system implemented using a DOS batch file (MENU.BAT) and an executable (SHOWMENU.EXE). The batch interpreter reads the file line by line.

    REM MENU.BAT initially
    :start
    SHOWMENU.EXE
    REM Place for selected command call
SHOWMENU.EXE runs, displays a menu, and the user selects an action (e.g., run SOMENAME.BAT). Instead of running SOMENAME.BAT directly and staying in memory, SHOWMENU.EXE rewrites the MENU.BAT file before exiting:

    REM MENU.BAT after selection
    :start
    SHOWMENU.EXE
    CALL SOMENAME.BAT
    GOTO start
When SHOWMENU.EXE exits, the DOS interpreter reads the next line from the now-modified MENU.BAT, which is CALL SOMENAME.BAT. After SOMENAME.BAT finishes, the interpreter executes the line after that, GOTO start, which takes it back to the beginning of the file, restarting SHOWMENU.EXE. When the user selects "Quit", SHOWMENU.EXE rewrites the file back to its original state (or removes the CALL and GOTO).

This technique isn't modifying executable machine code, but it modifies the script source that the interpreter is reading line-by-line, achieving dynamic control flow based on runtime state. The label :start or equivalent padding might be needed to ensure the interpreter reads from the correct byte offset in the modified file. A key advantage here was saving memory; SHOWMENU.EXE didn't need to stay loaded while SOMENAME.BAT ran.

Dynamic Code Generation (JIT-like): Techniques like Lisp macros or Just-In-Time (JIT) compilation involve programs generating new code (often in an intermediate representation or even machine code) at runtime, but this is distinct from modifying existing compiled code in place. JIT compilers analyze and compile performance-critical sections of code dynamically, which shares some goals with SMC (runtime optimization) but uses a different mechanism (compiling new code, not patching existing compiled code).
Reflective Programming: Some languages allow inspecting and manipulating program structure, including defining new classes or methods at runtime. This is a higher-level form of dynamic behavior modification.
While high-level languages rarely support true SMC, they offer alternative paradigms like polymorphism, reflection, and dynamic interpretation to achieve adaptable behavior.
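As a small illustration of the reflective approach, the following Python sketch defines a class at runtime with the built-in type(), then inspects it and calls its methods by name. The class name Rect and the functions area and perimeter are invented for the example.

```python
def area(self):
    return self.width * self.height

def perimeter(self):
    return 2 * (self.width + self.height)

# Define a class at runtime: name, base classes, and a dict of attributes.
Rect = type("Rect", (object,), {"area": area, "perimeter": perimeter})

r = Rect()
r.width, r.height = 3, 4

# Inspect the class: list its public callables, then invoke each one by name.
method_names = sorted(
    name for name, value in vars(Rect).items()
    if callable(value) and not name.startswith("_")
)
results = {name: getattr(r, name)() for name in method_names}

print(method_names)   # ['area', 'perimeter']
print(results)        # {'area': 12, 'perimeter': 14}
```

Nothing here touches machine code. The program changes what it can do by creating and querying language-level objects, which is why this counts as an alternative to SMC rather than an instance of it.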
Other Indirect Forms
- Control Tables: Programs can use large data tables containing parameters, function pointers, or flags that control the flow of a general-purpose interpreter or dispatcher loop. By changing values in the data table, the program changes its behavior, even though the core interpretation logic remains static. This is often seen in transaction processing systems.
- Channel Programs (IBM): In IBM mainframes, channel programs are lists of commands executed by a dedicated I/O processor. These programs could instruct the channel to read data (like a disk address) into a buffer within the channel program itself, which would then be used by a subsequent command (like Read Disk) in the same channel program. This is a form of self-modification at the I/O processor level.
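The control-table idea can be sketched in a few lines of Python. The transaction codes ("DEP", "WDR") and the handler functions are hypothetical; the point is only that the dispatcher loop stays fixed while the table it consults is ordinary, editable data.

```python
def deposit(balance, amount):
    return balance + amount

def withdraw(balance, amount):
    return balance - amount

def guarded_withdraw(balance, amount):
    # Hypothetical stricter handler: refuse to overdraw the account.
    return balance - amount if amount <= balance else balance

# The control table: transaction code -> handler. It is plain data.
control_table = {"DEP": deposit, "WDR": withdraw}

def process(balance, transactions):
    """Static dispatcher loop; its behavior is determined by control_table."""
    for code, amount in transactions:
        balance = control_table[code](balance, amount)
    return balance

transactions = [("DEP", 100), ("WDR", 150)]
before = process(0, transactions)          # -50 with the plain handler

control_table["WDR"] = guarded_withdraw    # change behavior by editing data
after = process(0, transactions)           # 100: the overdraft is refused

print(before, after)
```

The process function is byte-for-byte the same in both runs. Only the table entry changed, which is what separates this indirect form from patching instructions.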
A Glimpse into History
SMC wasn't always considered "forbidden." In the early days of computing, memory was extremely limited, and processors were much simpler, often lacking robust instruction sets or hardware support for things like subroutine calls or index registers.
- Early machines like the IBM SSEC (1948) could treat instructions as data.
- SMC was used to implement subroutine calls and returns when hardware support was absent, by dynamically writing the return address into a jump instruction at the end of the subroutine.
- It was a common technique for tight optimization loops or saving precious memory by having one piece of code serve multiple slightly different purposes by modifying itself.
- Donald Knuth's theoretical MIX architecture in The Art of Computer Programming uses SMC for subroutine calls, reflecting practices of the time.
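The subroutine-return technique in the list above can be sketched with a toy machine that has no call stack. The instruction names (PRINT, JMP, SETRET, HALT) are invented for this Python illustration, and SETRET folds into one step what early machines did with ordinary load and store instructions: it writes the return address into the operand of the jump at the end of the subroutine.

```python
def run(memory):
    """Tiny machine without a call stack; each cell is an [opcode, operand] list."""
    pc, out = 0, []
    while True:
        op, arg = memory[pc]
        pc += 1
        if op == "PRINT":
            out.append(arg)
        elif op == "JMP":
            pc = arg
        elif op == "SETRET":
            # pc now points at the JMP that performs the call, so pc + 1 is the
            # return point; write it into the operand of the jump at memory[arg].
            memory[arg][1] = pc + 1
        elif op == "HALT":
            return out

EXIT = 8   # address of the subroutine's final jump, patched before each call
program = [
    ["SETRET", EXIT],     # 0: patch the exit jump to return to address 2
    ["JMP", 7],           # 1: "call" the subroutine
    ["PRINT", "back 1"],  # 2
    ["SETRET", EXIT],     # 3: patch the exit jump to return to address 5
    ["JMP", 7],           # 4: second call, different return point
    ["PRINT", "back 2"],  # 5
    ["HALT", None],       # 6
    ["PRINT", "in sub"],  # 7: subroutine body
    ["JMP", 0],           # 8: placeholder target, overwritten by SETRET
]

output = run(program)
print(output)   # ['in sub', 'back 1', 'in sub', 'back 2']
```

The same subroutine returns to two different places because its last instruction is rewritten before every call. A hardware stack or a link register later made this patching unnecessary.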
As hardware evolved with more complex instruction sets, larger memories, and features like index registers and hardware stacks for subroutines, the necessity for SMC for basic tasks diminished. However, its power for deep optimization and dynamic behavior persisted.
Why Use the Forbidden Code? (Advantages and Use Cases)
Despite its challenges, SMC offers unique capabilities that make it valuable in specific contexts:
- Ultimate Optimization: By eliminating conditional checks from critical paths, SMC can shorten the execution path for a given state. The first-call subroutine example above illustrates this: once the instruction has been patched, the flag test never runs again.
- Runtime Specialization: Code can be generated or modified at load time or runtime to be highly specific to the exact data or environment it will process.
- Example: A sorting routine that needs to compare elements using a specific, complex key structure. Instead of calling a generic comparison function (with call overhead) or checking key fields inside the loop (with branching overhead), the program can dynamically build the comparison logic directly into the sort loop's code, hardcoding the field offsets and types based on the data structure being sorted. This is similar to JIT compilation but can be applied more manually.
- Example: Real-time graphics or signal processing where algorithms can be heavily optimized based on input parameters (e.g., filter kernel size, data format).
- Overcoming Instruction Set Limitations: As seen with the Intel 8080 example, SMC can be used to perform operations that are not directly supported by the architecture's fixed instructions. This is also relevant for theoretical architectures like the One-Instruction Set Computer (OISC).
- Code Camouflage and Protection: SMC makes static analysis difficult. Disassemblers see the code in its initial state, which might be obfuscated or simply not the code that actually runs. Debuggers can struggle to follow execution if the code they are stepping through suddenly changes.
- Use Case: Copy protection in old software, malware (viruses, shellcode) trying to avoid detection by signature-based scanners or analysis tools. Polymorphic code engines use SMC to decrypt and then potentially mutate a piece of malicious code before executing it.
- Low-Level System Bootstrapping: Early bootloaders, especially on microcomputers, often used SMC because they were small, needed to perform hardware-specific tasks, and memory was extremely tight. Even modern bootloaders sometimes use self-relocation techniques.
- Fault Tolerance: In some highly specialized systems, code might dynamically patch itself to work around detected hardware faults or software errors.
- Evolutionary Computing: Systems designed to evolve code (like genetic programming) may use SMC or SMC-like mechanisms to allow the program to literally change its own structure and behavior over time, with changes being kept if they improve performance according to a fitness function (as explored by researchers like Jürgen Schmidhuber).
- Operating System Adaptation: The Linux kernel uses SMC in specific places (e.g., alternative instructions) to adapt the running kernel image based on the specific CPU features detected at boot time, allowing a single kernel binary to be optimized for various processor generations within an architecture family. The Synthesis kernel by Alexia Massalin took SMC to an extreme, dynamically generating code optimized for specific runtime objects.
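The runtime-specialization idea from the sorting example in the list above can be approximated in Python by generating source text with the field positions hard-coded and compiling it with exec. This is only a sketch: make_key_function and the record layout are hypothetical, the result is Python bytecode rather than machine code, and no speedup is claimed.

```python
def make_key_function(field_indexes):
    """Generate a key function with the record field positions hard-coded."""
    parts = ", ".join(f"record[{int(i)}]" for i in field_indexes)
    source = f"def key(record):\n    return ({parts},)\n"
    namespace = {}
    exec(source, namespace)        # compile the specialized source at runtime
    return namespace["key"], source

records = [("bob", 30, "nyc"), ("alice", 30, "sf"), ("carol", 25, "la")]

# Sort layout chosen at runtime: by age (field 1), then by name (field 0).
key, source = make_key_function([1, 0])
ordered = sorted(records, key=key)

print(source)
print(ordered)   # carol (25) first, then alice and bob (both 30) by name
```

The generated function contains no loop over field descriptions and no branching on the layout; the decision was made once, when the code was built. That is the same trade that machine-level specialization makes.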
The Downside: Challenges and Risks
If SMC is so powerful, why isn't it common practice? The reasons are numerous and significant:
- Readability and Maintainability: SMC is notoriously difficult to understand and debug. The code you read in the source listing is not the code that executes. Tracing execution requires understanding not just the control flow but also the data modifications that change the code itself. This dramatically increases development and maintenance costs.
- Security Vulnerabilities: SMC can conflict with fundamental security principles.
- W^X (Write XOR Execute): Modern operating systems enforce policies (like W^X) where memory pages are either Writable XOR Executable. A page cannot have both permissions simultaneously. This prevents attackers from injecting malicious code into a data buffer (writable) and then executing it. True SMC requires modifying instructions in an executable memory region, which conflicts directly with W^X. Circumventing W^X is a common goal of exploit development.
- Code Signing and Authentication: SMC breaks code signing. If a program's executable pages are signed to verify their integrity, any modification invalidates that signature. Systems enforcing code signing policies (common for drivers, OS components, and secure applications) often cannot run SMC.
- Malicious Modification: While SMC can be used for camouflage, keeping code pages writable so that a program can patch itself also leaves those pages open to modification by an attacker, for example through a buffer overflow or another injection attack, if memory protections aren't perfectly implemented.
- Performance on Modern CPUs: While SMC aims for performance, modern CPU features can turn it into a penalty.
- Cache Coherency: As mentioned, modifying code in the instruction cache requires explicit cache flushing/invalidation, which is a relatively slow operation. If modification happens frequently, the overhead can outweigh any benefit.
- Instruction Pipelines: Modern processors use pipelines to fetch and decode instructions ahead of execution. If an instruction changes just before it's needed in the pipeline, the pipeline might contain stale data, requiring a "pipeline flush" and refetch – another performance penalty.
- Environmental Restrictions:
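- W^X Operating Systems: Under strict W^X enforcement, a page can never be writable and executable at the same time, so code cannot be patched in place while it remains executable.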
- Harvard Architecture Microcontrollers: Many microcontrollers have separate memory spaces for instructions (ROM/Flash) and data (RAM). You can't execute code directly from data RAM, making SMC impossible.
- Threading Issues: If multiple threads are executing the same piece of self-modifying code concurrently, they can interfere with each other, leading to race conditions on the code itself and unpredictable, incorrect results. Careful synchronization would be required, adding significant complexity.
Advanced Topics and Related Concepts
SMC exists within a landscape of advanced programming techniques that push the boundaries of standard compilation and execution models:
- Polymorphic Code: Code that changes its appearance with each infection or execution instance while retaining the same functionality. Often uses SMC (decryption and mutation) as a core mechanism.
- Just-In-Time (JIT) Compilation: Compiling code dynamically at runtime. While not strictly SMC (it typically generates new code blocks rather than modifying existing compiled ones), it shares the goal of runtime optimization and specialization based on observed behavior.
- Reflective Programming: The ability for a program to observe and modify its own structure and behavior at a higher level (e.g., changing object properties, method tables, or even defining new code constructs in some languages).
- Homoiconicity: A property of languages where the primary representation of programs is also a primary data structure (e.g., Lisp, Prolog). This makes code manipulation as data (like Lisp macros generating code) very natural.
- Monkey Patching: Dynamically altering a class or module at runtime, typically in dynamic languages. This modifies the behavior associated with a name but isn't necessarily altering the underlying machine code instruction stream of an existing function in place.
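Monkey patching, the last item above, is easy to show in Python. The sketch below temporarily replaces the standard library's json.dumps with a wrapper; compact_dumps is a made-up name, and patching a shared module like this is done here only for demonstration and is undone immediately.

```python
import json

original_dumps = json.dumps

def compact_dumps(obj, **kwargs):
    """Replacement that defaults to compact separators, then defers to the original."""
    kwargs.setdefault("separators", (",", ":"))
    return original_dumps(obj, **kwargs)

before = json.dumps({"a": 1, "b": 2})
json.dumps = compact_dumps            # patch the module attribute at runtime
try:
    after = json.dumps({"a": 1, "b": 2})
finally:
    json.dumps = original_dumps       # restore, so other code is unaffected

print(before)                          # {"a": 1, "b": 2}
print(after)                           # {"a":1,"b":2}
print(json.dumps is original_dumps)    # True: the original function was never altered
```

The name json.dumps pointed at different functions at different times, but the original function object was never modified. This is rebinding a name, not rewriting an instruction stream.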
Conclusion: The Forbidden Knowledge
Self-modifying code is a powerful, intricate technique rooted in the early days of computing when hardware constraints demanded creative solutions. It offers unparalleled control over execution paths and enables extreme optimization and dynamic specialization.
However, in the era of complex operating systems, multi-core processors, sophisticated security models (like W^X), and emphasis on code maintainability, SMC has largely moved from mainstream practice to the realm of niche applications. It's found in low-level system code (bootloaders, some OS kernel features), specialized high-performance computing, evolutionary systems, and, yes, the "underground" – malware and anti-analysis techniques.
Understanding SMC is understanding a fundamental capability of executable code and the hardware it runs on. While rarely taught in standard curricula due to its difficulty, potential for bugs, and security implications, it remains a fascinating and potent tool for those who venture into the deeper layers of programming. Like any forbidden knowledge, it comes with great power and requires immense care. Tread carefully, and respect the dragons that guard the instruction stream.